IN THE CLAIMS

1. (Currently Amended) A method of processing audio-based data associated with a
particular language, the method comprising the steps of: 
storing the audio-based data; 

generating a textual representation of the audio-based data, the textual representation being 
in the form of one or more semantic units corresponding to the audio-based data, wherein a semantic
unit comprises a minimal unit of language having a semantic meaning; and

indexing the one or more semantic units and storing the one or more indexed semantic units 
for use in searching the stored audio-based data in response to a user query. 

2. (Original) The method of claim 1, wherein the semantic unit is a syllable.

3. (Original) The method of claim 2, wherein the syllable is a phonetically based syllable. 

4. (Original) The method of claim 1, wherein the semantic unit is a morpheme. 

5. (Original) The method of claim 1, wherein the generating step comprises decoding the 
audio-based data in accordance with a speech recognition system. 

6. (Original) The method of claim 5, wherein the speech recognition system employs a 
semantic unit based language model. 

7. (Original) The method of claim 1, wherein the indexing step comprises time stamping the
one or more semantic units. 

8. (Original) The method of claim 1, wherein the searching step comprises:
processing the user query to generate one or more semantic units representing the information
that the user seeks to retrieve;



Attorney Docket No. YQ999-426 

searching the one or more indexed semantic units to find a substantial match with the one 
or more semantic units associated with the user query; and 

retrieving one or more segments of the audio-based data using the one or more indexed 
semantic units that match the one or more semantic units associated with the user query. 
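By way of illustration only, and without limiting the claims, the indexing of time-stamped semantic units and the searching and retrieving steps recited above might be sketched as follows. The function names `build_index` and `search`, the romanized syllables, and the use of exact consecutive matching (rather than a "substantial match") are assumptions made for this sketch and do not appear in the claims.

```python
from collections import defaultdict


def build_index(units):
    """Index time-stamped semantic units (hypothetical helper).

    units: list of (semantic_unit, start_time_seconds) in spoken order.
    Returns a mapping from each unit to its (position, start_time) entries.
    """
    index = defaultdict(list)
    for position, (unit, start) in enumerate(units):
        index[unit].append((position, start))
    return index


def search(index, query_units):
    """Return start times at which query_units occur consecutively.

    Exact matching is a simplification of the "substantial match"
    recited in the claim.
    """
    if not query_units:
        return []
    hits = []
    for position, start in index.get(query_units[0], []):
        # Every later query unit must appear at the next position in sequence.
        if all(
            any(p == position + offset for p, _ in index.get(unit, []))
            for offset, unit in enumerate(query_units[1:], start=1)
        ):
            hits.append(start)
    return hits


# Illustrative data only: romanized syllables with start times in seconds.
units = [("bei", 0.0), ("jing", 0.4), ("da", 0.9), ("xue", 1.3),
         ("bei", 5.0), ("fang", 5.5)]
index = build_index(units)
print(search(index, ["bei", "jing"]))
```

Each returned time stamp identifies where a matching segment of the stored audio-based data begins, so the segment can be retrieved for playback.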

9. (Original) The method of claim 8, wherein the searching step further comprises presenting
the retrieved data to the user. 

10. (Original) The method of claim 1, wherein the particular language is an Asian based 
language. 

11. (Original) The method of claim 10, wherein the particular language is Chinese. 

12. (Original) The method of claim 11, wherein the semantic unit is a Chinese character. 

13. (Original) The method of claim 1, wherein the particular language is a Slavic based 
language. 

14. (Original) The method of claim 1, wherein the one or more semantic units are indexed 
according to speaker attributes. 

15. (Original) The method of claim 1, wherein the one or more semantic units are indexed 
according to at least one of when the audio based data was produced and where the audio based data 
was produced. 

16. (Original) The method of claim 1, further comprising the step of storing video based data
associated with the audio based data for use in searching the stored audio based data and the video 
based data in response to a user query. 
17. (Original) The method of claim 16, wherein the searching step includes a hierarchical
search routine. 

18. (Original) The method of claim 1, wherein the generating step comprises 
stenographically transcribing the audio-based data to generate the textual representation. 

19. (Currently Amended) Apparatus for processing audio-based data associated with a 
particular language, the apparatus comprising: 

at least one processor operative to: (i) store the audio-based data; (ii) generate a textual
representation of the audio-based data, the textual representation being in the form of one or more
semantic units corresponding to the audio-based data, wherein a semantic unit comprises a minimal
unit of language having a semantic meaning; and (iii) index the one or more semantic units and store
the one or more indexed semantic units for use in searching the stored audio-based data in response 
to a user query. 

20. (Currently Amended) An audio-based data indexing and retrieval system for processing 
audio-based data associated with a particular language, the system comprising: 

memory for storing the audio-based data; 

a semantic unit based speech recognition system for generating a textual representation of 
the audio-based data, the textual representation being in the form of one or more semantic units 
corresponding to the audio-based data, wherein a semantic unit comprises a minimal unit of language
having a semantic meaning;

an indexing and storage module, operatively coupled to the semantic unit based speech 
recognition system and the memory, for indexing the one or more semantic units and storing the one 
or more indexed semantic units; and 

a search engine, operatively coupled to the indexing and storage module and the memory, 
for searching the one or more indexed semantic units for a match with one or more semantic units 
associated with a user query, and for retrieving the stored audio based data based on the one or more
indexed semantic units. 

21. (Previously Added) The method of claim 5, wherein the speech recognition system 
employs a syllable language model. 

22. (Previously Added) The method of claim 21, wherein production of the syllable language
model comprises the steps of: 

transcribing audio data to generate syllables; 

deriving conditional probabilities of distribution based on the generated syllables; and

using syllable counts and the conditional probabilities to construct the syllable language
model.
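By way of illustration only, the production of a syllable language model from syllable counts and conditional probabilities might be sketched as below. The choice of a bigram model with unsmoothed maximum-likelihood estimates, the function name `build_syllable_bigram_model`, and the sample transcripts are assumptions made for this sketch; the claim does not specify the order of the model or any smoothing.

```python
from collections import Counter


def build_syllable_bigram_model(transcripts):
    """Build syllable counts and bigram conditional probabilities.

    transcripts: list of syllable sequences (each a list of strings).
    Returns (unigram_counts, conditional), where
    conditional[(prev, nxt)] estimates P(nxt | prev) without smoothing.
    """
    unigram_counts = Counter()
    bigram_counts = Counter()
    for syllables in transcripts:
        unigram_counts.update(syllables)
        bigram_counts.update(zip(syllables, syllables[1:]))

    # Count how often each syllable appears as the first element of a bigram.
    context_counts = Counter()
    for (prev, _), count in bigram_counts.items():
        context_counts[prev] += count

    conditional = {
        (prev, nxt): count / context_counts[prev]
        for (prev, nxt), count in bigram_counts.items()
    }
    return unigram_counts, conditional


# Illustrative transcripts only (romanized syllables).
transcripts = [["zhong", "guo", "ren"], ["zhong", "guo"], ["zhong", "wen"]]
counts, model = build_syllable_bigram_model(transcripts)
print(counts["zhong"], model[("zhong", "guo")])
```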

23. (Previously Added) The method of claim 1, wherein the user query comprises a word. 

24. (Previously Added) The method of claim 23, wherein the searching step further 
comprises transforming the word into a sequence of syllables using a text-to-phonetic syllable map. 
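By way of illustration only, transforming a query word into a sequence of syllables using a text-to-phonetic syllable map might be sketched as follows. The representation of the map as a dictionary, the function name `query_to_syllables`, the lower-casing and whitespace splitting of the query, and the sample entries are all assumptions made for this sketch.

```python
def query_to_syllables(query, syllable_map):
    """Convert a query of one or more words into a flat syllable sequence.

    syllable_map: hypothetical dict mapping a lower-case word to its
    list of phonetic syllables.
    Raises KeyError if a word has no entry in the map.
    """
    syllables = []
    for word in query.lower().split():
        if word not in syllable_map:
            raise KeyError(f"no syllable mapping for word: {word!r}")
        syllables.extend(syllable_map[word])
    return syllables


# Illustrative map entries only.
syllable_map = {"beijing": ["bei", "jing"], "daxue": ["da", "xue"]}
print(query_to_syllables("Beijing daxue", syllable_map))
```

The resulting syllable sequence can then be compared against the indexed semantic units in the searching step.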

25. (Previously Added) The method of claim 3, wherein a phonetically-based syllable 
comprises a toneme. 

26. (Previously Added) The method of claim 3, wherein two or more different 
pronunciations are associated with a phonetically-based syllable. 

27. (Previously Added) The method of claim 1, wherein the generating step comprises 
producing the textual representation via stenography. 
28. (Previously Added) The method of claim 1, wherein the searching step comprises use
of a hierarchical index. 



29. (Previously Added) The method of claim 1, wherein the searching step comprises use 
of an automatic boundary marking system. 



